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Abstract. This contribution describes the "spectro-perfectionism" algorithm of Bolton 
& Schlegel (2010) that is being implemented within the Baryon Oscillation Spectro- 
scopic Survey (BOSS) of the Sloan Digital Sky Survey 111 (SDSS-111), in terms of its 
potential to deliver Poisson-limited sky subtraction and lossless compression of the in- 
put spectrum likelihood functional given raw CCD data. 

The Baryon Osc iUation Spe ctroscopic Survey (BOSS) of the Sloan Digital Sky 
Survey III (SDSS-III lEisenstein et al..,201 1.) is the largest extragalactic spectroscopic 
survey to date, both in the number of spectra being collected and in the volume of 
the universe being mapped. BOSS has been delivering survey-quality spectra since 
December 2009, and is on track to obtain spectra of more than 1.5 million galax- 
ies by mid-2014. While BOSS target galaxies are fainter than the galaxies observed 
in the original SDSS, the night-sky foreground is as bright as ever. This decreasing 
signal-to-foreground trend — which will onl y be exacerbated in f uture redshift surveys 
such as the proposed BigBOSS experiment dSchlegel et al.ll20l"l ) — demands a new ap- 



proach to spectroscopic data analysis. Th is contribution to the pro ceedings describes 
the "spectro-perfectionism" algorithm of ( Bolton & Schlegeill2010l hereafter B&S) in 



terms of its potential to deliver two important criteria for faint-object spectroscopy: 
(1) Poisson-limited night-sky foreground subtraction, and (2) lossless likelihood func- 
tional compression. The B&S algorithm is based upon extraction via forward-modeling 
to the raw CCD data using the full two-dimensional (2D) point-spread function (PSF) 
of the spectrograph, and is currently in the software implementation phase for eventual 
deployment within the BOSS survey. 

The current state of the art for ext raction of sp e ctra in opt ical astronorn y is the 
optimal extraction method described by iHewett et all (Il985h and iHome (Il986h . which 



models the spectrum in each successive CCD pixel row (corresponding to a particular 
effective wavelength) via a maximum-likelihood scaling of a spatial cross-sectional 
profile. The shortcomings of this method can best be understood by considering how it 
extracts an emission line, such as one of the OH features that are prominent redward of 
7000 A. If the PSF is a non-separable function of x and y in CCD coordinates (such as 
will be the case for a PSF consisting of a core plus wings, or a PSF with significant tilted 
ellipticity and/or skewness), then the actual spatial cross-sectional profile will depend 
upon the location of the cross section within the spectrum (e.g., a cross section through 
the wing will be broader than a cross section through the core). By extracting with a 
single mean profile p{x) rather than the true cross-sectional profile q{x) in a given row. 
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a fractional bias in the extracted counts is introduced which is given by 

-1 

(1) 

In practice this bias may be small, but when coupled to variations in the spectrograph 
PSF across the field of view of a multifiber spectrograph, it can result in significant 
systematic residuals after the subtraction of a model sky spectrum. For very faint galaxy 
spectroscopy, we wish to push these residuals well below the 1% level, and hence we 
are pursuing extraction using the mathematically correct forward model given by the 
2D PSF rather than the mean ID cross-section. Given a sufficiently accurate model for 
the PSF and its variation across the instrument, we can construct a noise-limited model 
for the raw CCD data, and can furthermore model the common sky spectrum upstream 
from the variation of the PSF. 

To realize the full benefit of 2D PSF-based extraction, we develop a mathemati- 
cally explicit answer to the question: what is a spectrum? Let the vector f represent an 
extracted ID spectrum in the sense commonly understood in astronomy. Its most im- 
portant function is to permit quantitative inference about the object under observation. 
For this purpose, the full information content of the spectrum is not only in the vector f , 
but also in the estimate of its noise properties and its resolution. We will represent the 
noise by the statistical covariance matrix C of the pixels in the spectrum, and the reso- 
lution by a matrix R that encodes the "line-spread function" (LSF) of the spectrum. If it 
is diagonal, C can be represented by a vector of pixel errors (and this is often assumed 
even when it is not true). The matrix R represents the instrumental "blurring" of an 
input spectrum, and is ideally band diagonal, with a bandwidth that is neither too large 
(oversampled) nor too small (undersampled). All the inferential power of the spectrum 
is encapsulated by the "hkehhood functional" of a model spectrum m given the data, 
which if we assume Gaussian noise can be expressed by the statistic: 

xlp^im I data) = (f - Rm)Tc-i(f - Rm) . (2) 

Let us now consider the raw data and calibrations from which f, C, and R are 
derived. Generally speaking, the predicted counts in CCD pixel / (where / is a single 
index that is understood to run over rows, columns, and multiple CCDs) may be related 
to the input model spectrum m(A) through the linear relation 

= J dAAi{A)m{A), (3) 

where A,(/l) is a transfer function for pixel i. If we assume sufficient finite sampling 

points of the spectrum and transfer function in the wavelength domain indexed by j, 
we may represent the operation of the telescope, instrument, and detector by a matrix 
relationship 

Ci^AijiUj. (4) 

Here, the matrix A, which we shall refer to as the "calibration matrix", generalizes 
and incorporates the spectrum trace solution, wavelength solution, 2D spectrograph 
PSF, relative and absolute throughput variation, variations in CCD pixel sensitivity, and 
all other calibration considerations that are generally considered separately from one 
another. The problem of accurately estimating A given standard cahbrations (i.e., arcs 



= I - I J" dx p(x)q(x) \ \^ dx q^(x) 
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and flats, as well as science frames for self-calibration) is perhaps the most challenging 
aspect of spectroscopic data reduction. The strategy for determining A is beyond the 
scope of this contribution; here we only note that in most instrumental contexts, the 
calibration matrix will have sufficient s ymmetry, spa rsity, and smoothness to make its 
estimation a well-constrained problem /Pandeyl 2011 ). 



If we assume that A has been determined, that we have a set of measured science 
CCD pixel counts represented by the vector p, and that the associated (uncorrelated) 
squared pixel errors constitute the diagonal raw-pixel covariance matrix N, the likeli- 
hood functional of any model input spectrum vector m is represented by 

xLim I data) = (p - AmfN-\p - Am) , (5) 

and in principle any and all inference can be made directly against the raw pixels. In 
practice this approach is generally both onerous and unnecessary. However, the obvious 
analogy between Equations |2] and [5] suggests the following operational definitions for 
three commonly understood stages of spectral data analysis: 

Calibration = Likelihood functional determination 
Extraction = Likelihood functional compression 
Measurement = Likelihood functional projection 

The quality of an extraction algorithm can thus be judged by the lossiness of its com- 
pression of the input-spectrum likelihood functional through the translation of (^^ 
specified by A, N, and p) into;^fspg^. (as specified by R, C, and f). 

To arrive at a lossless extraction algorithm, note first that we can approach Equa- 
tion |5]naively to determine a maximum-likelihood (minimum-;i'^) solution for the input 
spectrum m as 

m^,. - (ATn-1A)-1(ATn-')p . (6) 

However, this estimator has undesirable properties as an extracted spectrum. The square 
matrix (A^N~'A)~\ being the inverse of the symmetric banded non-negative matrix 
A^N~'A, will have significant bandwidth and large positive/negative fluctuations off 
the diagonal. Since this is the covariance matrix of the sample elements of nim.i., the 
estimated spectrum will exhibit significant ringing. This is an expected consequence of 
the deconvolution implicit in the solution for mm.i.. 

This deconvolution problem highlights an apparent advantage of the row-by-row 
optimal extraction algorithm: while row-by-row deconvolves the 2D spectrum image 
in the spatial direction (on the strong prior assumption that the input signal is a one- 
dimensional spectrum distributed according to a known profile), it does not deconvolve 
the instrumental profile in the wavelength direction, and hence introduces no ringing. 
The price, as seen above, is that this computation is carried out using a mathematically 
incorrect model for the projection of photons onto the detector. One solution in the con- 
text of 2D PSF extraction would be to impose a regularizing prior on mm.i.. However, 
regularization of the input spectrum model is the purview of science analysis, not of 
data reduction. If implemented at this stage, it would break the model of extraction as 
likelihood functional compression. 

To proceed, consider a diagonalization of the matrix A^N'^A as 



A^N^A = R^C^R , 



(V) 
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where R is a square matrix (in contrast to A, which has many more rows than columns) 
and C~' is a diagonal matrix. This diagonalization need not necessarily be in the eigen- 
decomposition sense. To factorize in the sense described in B&S, we first consider 
the matrix U whose columns are the eigenvectors of A^N^^A, and the corresponding 
diagonal matrix E of eigenvalues. Since the original matrix is real, symmetric, and 
positive definite, U is orthogonal, with its inverse equal to its transpose, and all its 
eigenvalues are real and positive. Now we define 



where E ' is a diagonal matrix whose entries are the positive square roots of the en- 
tries of E, and S is a diagonal matrix defined so that R is normalized to unity in each 
row when summed over columns. (The more intuitive "flux-conserving" normalization 
associated with a sum over rows is less straightforward to implement.) With this defini- 
tion of R and C, Equation His satisfied exactly. We have diagonalized the (inverse) co- 
variance matrix, opting not for the orthogonality of U, but rather for the band-diagonal 
locality of R, which derives from its relationship to the matrix square root of the band 
diagonal matrix A^N^'A. That is to say, the matrix R by our definition will "look like" 
a line-spread function in the sense understood in astronomical spectroscopy. 

We note in passing that the definition of R that we have adopted, while well moti- 
vated, is not unique. In fact, the motivation for a smoothly varying line spread function 
with wavelength may even be worth accepting a small degree of off-diagonal covari- 
ance. A desire for a flux-conserving definition of R may also motivate different choices. 
As noted by B&S, cross-talk between neighboring spectra of a multiobject spectrograph 
presents additional complication. The particular choices and approximations adopted 
will necessarily be driven by instrumental and scientific context. 

Finally, we define our extracted spectrum f as the "reconvolution" of our decon- 
volved extraction by the resolution matrix R, thereby removing the ringing: 



Now, using the definitions of R, C, and f given in Equations [8l-fT0l we find that;^fspg^. as 

given by Equation |2] is equal up to a constant offset to;^^^^^,, as given by Equation [Sj/or 
any trial model spectrum m. Thus we attain the goal of extraction as lossless likelihood- 
functional compression, with A, N, and p replaced respectively by the band-diagonal 
LSF matrix R, the diagonal spectrum covariance matrix C (i.e., an error vector), and 
the flux vector f. Equally important is the fact that R, C, and f collectively satisfy the 
intuitive understanding of "a spectrum" in the astronomer's sense! 
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f = Rmm.1. = R(ATn-'A)-1(A'^N"1)p . 



(10) 



